-
-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only allow users to specify a single location per parse call & pass full location data to schema.load #420
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just drive-by's for now. Good work! Thanks for being so thorough with the docs.
I'll give this a proper lookover when I have more time to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed this a bit more. I'm basically good with the approach here.
I cut 5.5.0, so you'll need to update with dev
. Feel free to merge instead of rebase; I intend to squash and merge when this is all done, if you don't mind.
I also made a 5.x-line branch so that we can maintain support for 5.x for some time after we release 6.0.
Should we rename Just a thought: doesn't need to happen in this PR. |
d5c0429
to
176a9f7
Compare
I don't mind changing it or keeping it the same. There isn't a huge difference between these names. One advantage of renaming is that it would immediately fail for webargs v5 users who are using |
Up to you. Feel free to do the |
I'll probably do it in here, since it's already a large, breaky change. Something that I've started to realize poses an issue: the way that I've been writing things, such that webargs treats an empty JSON payload as The issue is that I'm working on a refactor now in which Part of the goal is to ensure it's easy to add something like def load_json_or_form(self, req, schema):
data = self.load_json(req, schema)
if data is missing:
data = self.load_form(req, schema)
return data and I think this achieves that. This will have a public-facing impact on how webargs v6 behaves, but I think generally for the better. It will make it easier to use webargs in APIs with optional request bodies. |
Er, to clarify, I think the above is important to be very clear and open about because it constitutes another reversal on #297 (back to the current behavior) The first version of this PR changed the decision on that saying that One of the big cases that changes my mind is the idea of an endpoint with an optional JSON body. |
As of f005c40 , this is finally be passing tests, as all of the parser conversions are finished. And I've just merged This is what I'd consider the very minimum to be safe to merge. But I still plan to do two more things before removing
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent. Let me know when you're confident in merging this.
I've just removed As an important final change, I was trying to add I've added b8d3fee , which makes the decision to check content-type on the frameworks where this was not done (django and pyramid) and return |
Great. I probably won't be able to review this until this weekend at the earliest. Thanks for all your work on this. |
Just as a short FYI: I'm going to be taking a couple of weeks off, and likely won't be able to respond to any feedback for the next two weeks or so anyway. So please don't worry if you write some comments and don't see a reply -- this isn't abandoned, and I'm still invested in seeing it through. |
In light of the fact that it motivates b8d3fee , I've added the |
Thank you so much @sirosen for tackling this and sorry for the so long delay. I've tried to port flask-smorest to your webargs branch and I only had two adaptations to make:
Those are known breaking changes. My point is that I didn't discover any issue. Then I tried to port the application we're working on. I had to
Again, nothing surprising. I shall now actually review the code, but at least it proved functional on our real-life cases. Thanks again. |
I'll take some time soon (hopefully next few days) to revisit the changelog and make sure it's accurate. In particular
made me realize that two parsers have changed (the Anyway, please don't worry about taking time with this! I more than understand how hard it can be to get time to work on open source projects and properly review big PRs. 🙂 |
Thanks. I'm using Flask, BTW, so I guess this also changed on |
I could be wrong but here's my understanding:
So the breaking change would only be the default location checked for input. |
I just amended the changelog for 6.0 based on a reread of the changes. Listed as a bugfix (because it makes webargs internally consistent), several parsers now require
There are new changes in |
This is early documentation of planned changes, for this branch of work. Note the change from `locations=...` to `location` in Parser.use_[kw]args methods, and that fields may no longer specify a location. Additionally, detail that schemas now load all data from a location, not only those for which there are matching fields. The changelog explains how to rewrite a schema which loads from multiple locations into multiple `use_[kw]args` calls, and how to use `unknown` to get webargs v5-like behavior.
Stated goals in the form of CHANGELOG update Gut significant chunks of webargs.core - replace "parse_{location}" with "load_{location}" "locations=<collection>" with "location=<string>" "location_handler" with "location_loader" - remove "parse_arg" "get_value" - add webargs.multidictproxy which uses a Schema to handle MultiDict types and behave similarly to webargs.core.get_value As a proving ground for this implementation/approach, get flaskparser tests to pass under ma3 Fixes as the location loading code undergoes refactoring for the flaskparser: - Restore missing mimetype check for JSON payloads - Treat form data as a multidict without extraneous AttributeError check (after testing, flask will return '{}' for form data when it is missing or empty -- just respect/allow this) - req.headers is a multidict type as well - fix some mistakes with new argument ordering Changes to the base test app at src/webargs/testing.py of note: - Because each route can only specify that it loads data from a single location (unless you pull extra tricks), several testing cases have had their routes modified, e.g. "/echo" -> "/echo_json" - New test, test_parse_json_list_error_malformed_data added to check that webargs does not incorrectly treat non-list data as singleton lists when passing data to a schema with list fields. The previous behavior was questionable before, but with full data going to schemas it definitely becomes incorrect. - Many tests which previously expected success with malformed data ignored now expect 422 errors. For example, sending a non-dict JSON value (e.g. `1`) should fail because it cannot be loaded by the schema. It should *not* be silently ignored because data was provided and failed to parse. This is a major behavior change for webargs. Supporting modifications were made in tests/apps/flask_app.py , including the addition of `ma.EXCLUDE` when the detected marshmallow version is >= 3. After work on the relevant PR and trying things both ways, the current handling of empty JSON bodies (interpreting them as {}) is being kept externally, but changed internally. This relates to the decision made in marshmallow-code#297 . Originally in the course of work on marshmallow-code#419, it seemed like it might be necessary to change the behavior to make missing JSON data behave as `missing` externally. However, making the change internal and wrapping it to present the same external interface is best. load_* methods are expected to return `missing` or {} when data is missing for the given location. e.g. When a JSON body is "" it should be `missing`. This is considered correct. However, in terms of the external behavior of webargs, no data should be interpreted as `{}`. In order to achieve this, the wrapper which calls the various "loader_func"s to load location data explicitly checks the result for `missing` and converts it to `{}` if it is missing. This makes it easy to use webargs to implement an API with optional JSON data, but internally preserves the ability to detect a missing JSON payload and distinguish it from a payload containing `{}`.
If an empty body is submitted to a webargs parser, it should always be checked for content-type. Otherwise, the behavior framework-to-framework for webargs will not be consistent. Namely: djangoparser and pyramidparser now check content-type for JSON and return missing when an emtpy/missing payload is used. This makes the detection/use of JSON payloads more consistent and is necessary for `json_or_form`.
'json_or_form' is defined as first trying to load the JSON data, then falling back to form data. Adds a test to the testsuite for this which tries sending JSON, tries sending Form, and tries sending no data, verifying that we get the correct result in each case. This does *not* try to support more elaborate behaviors, e.g. multipart bodies, which probably will work variably well or poorly based on the framework in use. In these cases, the content type will not detect as JSON (because it is 'multipart'), and we'll failover to using form data. This is acceptable, since anyone using `json_or_form` is expecting to allow form posts anyway. Docs cover not only the new json_or_form location, but also the *idea* of a "meta location" which combines multiple locations. This is worth documenting in the advanced docs since it is the way to get back certain behaviors from webargs v5.x
This is not a deep and comprehensive rewrite which aims to discuss/explain the functionality which is now available. Rather, this change merely takes everything in the docs which has become inaccurate and trims or modifies it to be accurate again. Add notes to changelog about making parsers more consistent about checking Content-Type for JSON payloads.
1d68733
to
6b92b2a
Compare
I've just rebased and pushed a cleaned-up version of this branch which works against My attempt to handle this by merging things yesterday failed because I did not properly incorporate #428 . Because this PR introduces the idea of a shared |
Thanks. Rebase is better. Sometimes it is harder to achieve (too many changes and too many commits in the branch to rebase), so a merge is acceptable. If rebase is easier for you, then that's just great. I'd really like to merge this. I'll try to review this as soon as I can. Can't promise when. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I made a few comments.
I didn't review the async parsers.
I lean towards releasing as 6.0 rather than as a 6.0b pre-release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to get to the review feedback this past weekend, but it didn't happen. I'll try to find a good time for it soon. There's one item where I had a question to make sure I correctly understand what's wanted.
1. Fix a typo in the quickstart doc Rewrite "content_type" to "user_type". Just a slip-up. 2. Flatten out the nesting in MultiDictProxy.__getitem__ Although this was flagged as an `else-return` which could flatten out to just a return, Jérôme also suggested a version which treats the outer condition as an `if ...: return` so that we can dedent the rest of the logic. The end-result is easier to read, so let's go with that. 3. Add a test for passing a single value to a List field This applies to all parsers, but it was required by FalconParser in particular. If you remove MultiDictProxy from the presentation of query params, headers, etc. from FalconParser, it works in most cases. However, because field values only get wrapped in lists by that parser if there are multiple values supplied, we need some intermediary to understand that lookups for List fields always need a `list`-type value. MultiDictProxy serves that purpose, as a schema-aware dict type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed a commit with fixes for the several issues found in the last review.
I'm going to do a final double-check on everything and (hopefully) mark all of the outstanding comments as resolved.
Previously, `load_json` carefully avoided putting `missing` into the `Parser._cache["json"]` storage location. However, we're not clear (at least, in the context of marshmallow-code#420 ) on why we should avoid caching `missing` values. No tests fail, so change this to be simpler logic in which any "successful" JSON parsing, including those which failover to `missing`, will be cached.
Refer to the `__location_map__` attribute, and expand the list of supported locations to be complete (add `query` and `json_or_form`).
Okay, I missed a couple of small things, so I handled those in follow-up commits. I'm excited about merging this in part because I want to get cracking on the changes to |
Great. @sloria, should we release a beta to give frameworks time to adapt to the changes? Then we'd release a final 6.0.0 with Python 2 dropped. |
Haven't had time to review the latest changes, so I'll defer to @lafrech on merging this. Releasing a 6.0.0b1 makes sense to me |
Alright. Merging. I'll try to release a beta version today. |
@sloria, I just got this error again.
Am I missing privileges on this repo? |
@lafrech I just gave you admin privs. Can you try again? |
This is still incomplete, but core and flaskparser tests are now passing. No work is (yet) included on other concrete parsers, but I fixed the base asyncparser to make mypy linting pass.
Resolves #419, #267, #164, #268
Obsoletes, and therefore closes, #410
I've updated the changelog with a basic "statement of intent". As I work on this, I will try to maintain it as an accurate record of how this change should look to the outside world. Obviously, narrative docs also need to be updated with relevant info.
Highlights / greatest hits:
DEFAULT_LOCATIONS
withDEFAULT_LOCATION="json"
parse_{location}
withlocation_load_{location}
locations=<iterable>
withlocation=<str>
locations=...
from FieldsParser.parse_arg
webargs.core.get_value
webargs.multidictproxy.MultiDictProxy
which hasget_value
-like behavior and proxies various dict-like objects/echo
which loads data from a bunch of locations)1
for a JSON body will now parse and be passed literally to the schema, which will fail because1
is not a valid input forschema.load
). Changes to the tests serve as a somewhat-readable list of behavioral changes.To try to keep this from sprawling even more, my intent is to address some items mentioned in #419 either in follow-up PRs or as part of this only if we're happy with this direction (and agree on those items). In particular: the
json_or_form
location I proposed and nesteduse_args
usage and error collection.@sloria, @lafrech, my biggest question is: do you like where this is going? Should I just hammer out all of the other parsers?